Large Scale Disease Prediction

Author

  • Candice Chow
Abstract

The objective of this thesis is to present the foundation of an automated large-scale disease prediction system. Unlike previous work, which has typically focused on a small, self-contained dataset, we explore the possibility of combining a large amount of heterogeneous data to perform gene selection and phenotype classification. First, a subset of publicly available microarray datasets was downloaded from the NCBI Gene Expression Omnibus (GEO) [18, 5]. These data were then automatically tagged with Unified Medical Language System (UMLS) concepts [7]. Using the UMLS tags, datasets related to several phenotypes were obtained, and gene selection was performed on the expression values of this tagged microarray data. Using the tagged datasets and the list of genes selected in the previous step, we created classifiers that predict, based solely on the expression data, whether a new sample is also associated with a given UMLS concept. The results show that it is possible to combine a large heterogeneous set of microarray datasets for both gene selection and phenotype classification, and thus lay the foundation for automatic classification of disease types based on gene expression data in a clinical setting.

Thesis Supervisor: Bonnie Berger. Title: Professor.
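The two-step workflow in the abstract (gene selection on tagged samples, then classification of new samples) can be illustrated with a minimal sketch. The abstract does not name the selection statistic or the classifier, so everything here is an assumption made for illustration: genes are ranked by a Welch t-statistic, the classifier is a nearest-centroid rule, and the expression values are synthetic rather than GEO data. All function and variable names are hypothetical.

```python
import random
import statistics


def welch_t(a, b):
    """Welch t-statistic between two lists of expression values."""
    va, vb = statistics.variance(a), statistics.variance(b)
    denom = (va / len(a) + vb / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / denom if denom else 0.0


def select_genes(samples, labels, k):
    """Rank genes by |t| between tagged (1) and untagged (0) samples; keep the top k."""
    pos = [s for s, y in zip(samples, labels) if y == 1]
    neg = [s for s, y in zip(samples, labels) if y == 0]
    n_genes = len(samples[0])
    scores = [
        abs(welch_t([s[g] for s in pos], [s[g] for s in neg]))
        for g in range(n_genes)
    ]
    return sorted(range(n_genes), key=lambda g: scores[g], reverse=True)[:k]


def fit_centroids(samples, labels, genes):
    """Mean expression of the selected genes for each class."""
    centroids = {}
    for cls in (0, 1):
        rows = [s for s, y in zip(samples, labels) if y == cls]
        centroids[cls] = [statistics.mean(r[g] for r in rows) for g in genes]
    return centroids


def predict(sample, genes, centroids):
    """Assign the class whose centroid is nearest in squared Euclidean distance."""
    def dist(cls):
        return sum((sample[g] - c) ** 2 for g, c in zip(genes, centroids[cls]))
    return min(centroids, key=dist)


# Synthetic stand-in for tagged microarray data (assumption: not real GEO data).
rng = random.Random(0)
N_GENES = 50
INFORMATIVE = [3, 17, 42]  # hypothetical genes shifted in samples tagged with the concept


def make_sample(label):
    values = [rng.gauss(0.0, 1.0) for _ in range(N_GENES)]
    if label == 1:
        for g in INFORMATIVE:
            values[g] += 5.0
    return values


train_labels = [i % 2 for i in range(40)]
train = [make_sample(y) for y in train_labels]

genes = select_genes(train, train_labels, k=3)
centroids = fit_centroids(train, train_labels, genes)

test_labels = [i % 2 for i in range(20)]
test = [make_sample(y) for y in test_labels]
predictions = [predict(s, genes, centroids) for s in test]
accuracy = sum(p == y for p, y in zip(predictions, test_labels)) / len(test_labels)
```

In this toy setting the classifier only ever sees the selected genes, which mirrors the abstract's point that classification relies solely on expression data once the gene list is fixed. Real microarray data from heterogeneous platforms would also need normalization and probe-to-gene mapping, which this sketch omits.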

Similar resources

Prediction of Corona disease anxiety based on health locus of control and perceived social support

Corona disease anxiety is becoming common and has become a major mental health concern. People perceive Corona disease anxiety differently depending on their individual characteristics. Therefore, this study aimed to predict Corona disease anxiety based on health locus of control and perceived social support. The method of the present study was des...

Using Combined Descriptive and Predictive Methods of Data Mining for Coronary Artery Disease Prediction: a Case Study Approach

Heart disease is one of the major causes of morbidity in the world. Currently, a large proportion of healthcare data is not processed properly and therefore is not used effectively for decision making. The risk of heart disease may be predicted by investigating heart disease risk factors with data mining techniques. This paper presents a model developed using combined descri...

The Role of Personality traits in Prediction of Hope in Men with Cardiovascular Disease

Background: The primary goal of the present study was to investigate the role of personality traits in predicting hope in men with cardiovascular disease. Methods: The research design was correlational, and the study population consisted of 200 men with cardiovascular disease chosen by convenience sampling from people referred to medical centers in Tehran. F...

A Likelihood-Free Approach for Characterizing Heterogeneous Diseases in Large-Scale Studies

We propose a non-parametric approach for characterizing heterogeneous diseases in large-scale studies. We target diseases where multiple types of pathology present simultaneously in each subject and a more severe disease manifests as a higher level of tissue destruction. For each subject, we model the collection of local image descriptors as samples generated by an unknown subject-specific prob...

Large-scale prediction of microRNA-disease associations by combinatorial prioritization algorithm

Identifying associations between microRNA molecules and human diseases from large-scale heterogeneous biological data is an important step toward understanding the pathogenesis of diseases at the microRNA level. However, experimental verification of microRNA-disease associations is expensive and time-consuming. To overcome the drawbacks of conventional experimental methods, we presented a co...

Prediction of foundations behavior by a stress level based hyperbolic soil model and the ZEL method

In shallow foundations, the third bearing capacity factor, Nγ, has been found to decrease as the foundation size increases. This trend is supported by experimental observations and is related mainly to the stress-level-dependent nature of the soil. On the other hand, the bearing capacity is often obtained theoretically without considering the vertical displacements of the foundation. In thi...

Publication date: 2014